1.0 Introduction


This  document  summarizes  work  performed  by  Delta  Information 
Systems,  Inc.  (DIS),  for  the  National  Communications  System,  an 
organization of the U.S. Government, under Task 2.0 of
Modification P00007 of Contract DCA100-83-C-0047. The purpose of this
task  was  to  investigate  the  efficiency  and  potential  operational 
effectiveness  of  the  Scan  Line  Difference  Compression  (SLDC)  algorithm 
presented  in  Appendix  A.  Under  this  task,  DIS  performed  a  software 
simulation  study  of  the  SLDC  algorithm.  The  software  was  run  using 
selected  CCITT  binary  standard  images  as  input,  and  the  results  were 
compared  with  those  of  the  MODREAD  II  compression  algorithm  run  with 
the  same  input  data.  The  MODREAD  II  algorithm  was  chosen  for  the 
comparison  because  it  is  the  most  effective  compression  technique 
currently  available. 

DIS has analyzed the SLDC algorithm and has generated tapes
containing the Conditioned Image Files (CIF's) for the twelve (12)
combinations of binary image and resolution that were processed.
Each of the three images was processed at each of the four
resolutions:

Image               Resolution

CCITT Image #1      200 lines/inch
CCITT Image #5      240 lines/inch
CCITT Image #7      300 lines/inch
                    400 lines/inch


Each subtask of Task 2.0 is presented in a section of this
report. Section 2.0 discusses the first subtask, in which Delta
Information Systems studied a number of scan line conditioning
techniques and generated the CIF's and corresponding prediction
statistics using the technique selected in the study. In Section 3.0,
the second subtask is presented, in which the FORTRAN software modules
associated with the implementation of the SLDC encoding algorithm are
described both narratively and with structure charts, data flow
charts, and software specifications.

Section  4.0  includes  the  third  subtask,  in  which  the  software 
modules  associated  with  the  SLDC  decoding  algorithm  are  described; 
again,  the  software  is  described  both  narratively  and  with  structure 
charts,  data  flow  charts,  and  software  specifications.  The  fourth 
subtask,  in  which  Delta  Information  Systems  ran  the  SLDC  encoding 
program  on  each  image  with  various  sets  of  design  parameters,  is 
presented  in  Section  5.0.  The  compression  statistics  for  all  168 
simulation  runs  performed  by  Delta  Information  Systems  are  presented 
in  this  section. 




2.0  Conditioned  Image  File  Generation  and  Analysis 


The  objective  of  line  conditioning  is  to  increase  the 
compressibility  of  the  image  under  the  constraint  that  the  original 
image  can  be  reconstructed  from  the  CIF  without  distortion.  Delta 
Information  Systems  analyzed  a  number  of  scan  line  conditioning 
techniques  in  this  subtask  and  generated  twelve  Conditioned  Image 
Files  (CIF's),  one  for  each  image/resolution  combination,  with  the 
technique  selected.  An  example  of  a  CIF  is  presented  in  Figure  2.1; 
the  test  image  from  which  it  was  generated  appears  in  Figure  2.2. 

The  conditioning  technique  selected  by  DIS  predicts  the  binary 
state  of  an  image  element  based  on  a  weighted  average  of  four  of  its 
close  neighbors  and  produces  a  file  of  predictions,  where  correct 
predictions are '0's and incorrect predictions are '1's. Because each
predicted  element  is  based  on  the  state  of  four  neighboring  elements 
(see  Section  A-2.1  in  Appendix  A),  the  predictor  conditioning 
algorithm  is  completely  described  by  a  sixteen-entry  state  table;  the 
state  table  employed  in  this  study  is  presented  in  Table  2.1. 

It  was  originally  anticipated  that  each  CCITT  image  would  produce 
a  different  state  table;  however,  this  was  not  the  case.  DIS 
generated  CIF  prediction  statistics  for  each  file  of  the  input  image 
set.  The  results  that  were  tabulated  included  the  frequency  of 
occurrence  of  each  state  together  with  the  frequency  of  occurrence  of 
a '0' pel for that state; these data revealed that all of the input
images  produced  the  same  predictor  conditioning  algorithm,  the  one 
represented  by  the  state  table  in  Table  2.1. 
Table 2.1 - Predictor Conditioning

A  B  C  D    Prediction

0  0  0  0    0
0  0  0  1    1
0  0  1  0    0
0  0  1  1    1
0  1  0  0    0
0  1  0  1    1
0  1  1  0    1
0  1  1  1    1
1  0  0  0    0
1  0  0  1    0
1  0  1  0    0
1  0  1  1    1
1  1  0  0    0
1  1  0  1    1
1  1  1  0    0
1  1  1  1    1
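
The conditioning step and its inverse can be illustrated with a short sketch. The state table below is transcribed from Table 2.1. The positions of the neighbors A, B, C, and D are defined in Appendix A and are not reproduced in this section, so the geometry used here (three pels on the previous line plus the preceding pel on the current line) is only an assumption, as are the function names and the treatment of pels outside the image as white.

```python
# State table from Table 2.1: index = 8*A + 4*B + 2*C + D, value = predicted pel.
STATE_TABLE = [0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1]

def neighbors(image, row, col):
    """Return (A, B, C, D) for the pel at (row, col).

    ASSUMED geometry (not taken from the report): A, B, C are the pels
    above-left, above, and above-right; D is the pel to the left.
    Pels outside the image are treated as 0.
    """
    def pel(r, c):
        if r < 0 or c < 0 or c >= len(image[0]):
            return 0
        return image[r][c]
    return (pel(row - 1, col - 1), pel(row - 1, col),
            pel(row - 1, col + 1), pel(row, col - 1))

def predict(image, row, col):
    a, b, c, d = neighbors(image, row, col)
    return STATE_TABLE[8 * a + 4 * b + 2 * c + d]

def condition(image):
    """Build the CIF: 0 where the prediction is right, 1 where it is wrong."""
    return [[image[r][c] ^ predict(image, r, c)
             for c in range(len(image[0]))]
            for r in range(len(image))]

def restore(cif):
    """Invert condition(): rebuild the pels in raster order from the CIF."""
    image = [[0] * len(cif[0]) for _ in cif]
    for r in range(len(cif)):
        for c in range(len(cif[0])):
            # Every neighbor used by predict() has already been restored.
            image[r][c] = cif[r][c] ^ predict(image, r, c)
    return image
```

Because every assumed neighbor precedes the current pel in raster order, restore() can rebuild the image exactly, which is the no-distortion constraint stated at the start of this section.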
DIS generated run length statistics using the selected predictor
conditioning algorithm. For each CIF, a set of statistics was
tabulated; the statistics generated include the following:

1.  Number  of  bits  in  CIF; 

2.  Weight  of  CIF; 

3.  Histogram  of  the  weight  of  the  Conditioned  Scan  Lines 
(CSL's)  within  the  CIF; 

4.  Histogram  of  run  lengths  in  the  CIF;  and, 

5.  Two  histograms  of  the  weight  of  CIF  segments,  where 
each  CIF  segment  represents  a  contiguous  32-bit  or 
64-bit  section  of  the  CSL. 
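
As a rough illustration of the five statistics listed above, the following sketch tabulates them for a CIF held in memory as a list of scan lines. The data layout, the function names, and two details are assumptions rather than the report's method: run lengths are counted per line for runs of both zeros and ones, and a short final segment is counted as it stands.

```python
from collections import Counter

def run_lengths(bits):
    """Lengths of maximal runs of identical bits, in order."""
    runs, count = [], 0
    for i, b in enumerate(bits):
        count += 1
        if i + 1 == len(bits) or bits[i + 1] != b:
            runs.append(count)
            count = 0
    return runs

def cif_statistics(cif, segment_lengths=(32, 64)):
    """cif is a list of Conditioned Scan Lines, each a list of 0/1 ints."""
    stats = {
        "bits": sum(len(csl) for csl in cif),              # item 1
        "weight": sum(sum(csl) for csl in cif),            # item 2
        "csl_weight_hist": Counter(sum(csl) for csl in cif),   # item 3
        "run_length_hist": Counter(),                      # item 4
        "segment_weight_hist": {n: Counter() for n in segment_lengths},  # item 5
    }
    for csl in cif:
        stats["run_length_hist"].update(run_lengths(csl))
        for n in segment_lengths:
            for start in range(0, len(csl), n):
                weight = sum(csl[start:start + n])
                stats["segment_weight_hist"][n][weight] += 1
    return stats
```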

Delta  Information  Systems  generated  a  Conditioned  Image  File  for 
each  test  image  at  each  resolution  in  the  input  set.  The  set  of 
twelve  CIF's  was  written  onto  a  9-track,  1600  BPI  magnetic  tape  as 
described  in  the  SOW;  DIS  will  deliver  two  copies  of  this  tape.  The 
operating  instructions  for  the  software  modules  employed  to  generate 
the  CIF's  and  their  associated  statistics  are  included  in  Appendix  B; 
the  code  listings  for  these  modules  appear  in  Appendix  C. 


3.0  SLDC  Encoder  Software  Design 


Scan  Line  Difference  Compression  is  an  encoding  method  employed 
in  the  transmission  of  binary  images.  The  SLDC  encoder  program 
compresses  the  Conditioned  Image  Files  (CIF's)  generated  in  the 
previous  subtask  by  taking  advantage  of  the  local  conditional 
statistics  of  each  image.  The  SLDC  algorithm  is  described  in  detail 
in  Appendix  A;  a  brief  description  of  the  encoding  process  is 
presented  here,  along  with  the  documentation  for  the  software  modules 
associated  with  the  encoder  program.  The  operating  instructions  for 
the  encoder  program  are  presented  in  Appendix  B;  the  code  listings 
appear  in  Appendix  C. 

3.1 Functional Description

File encoding is done one line at a time. As each Conditioned
Scan Line (CSL) is read in, the Hamming weight of the line is
calculated. If the line weight is zero, the line is encoded with a
single bit, set to one (1). If the line weight is greater than zero,
the integer value of the line weight is placed in the first thirteen
bits of the Encoded Scan Line (ESL) (the integer value of the line
weight does not exceed 2^12) and the CSL is encoded as a series of
n-length segments, where n is the user-defined segment length. The
codewords for all possible segment weights which can occur on a line
of the current line weight are determined from the statistics of the
line; a Huffman coding technique, illustrated in Figure 3.1, is
employed to generate the codewords.
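
The line-level rule described above can be sketched as follows. The function name and the bit-string representation are hypothetical, and the sketch assumes the line weight stays strictly below 2^12 so that the thirteen-bit field always begins with a zero and cannot be mistaken for the single-bit zero-weight code.

```python
def line_header(line_weight):
    """Return the leading ESL bits for a CSL of the given Hamming weight."""
    if line_weight == 0:
        return "1"                      # zero-weight line: a single 1 bit
    # Assumption: weights stay below 2**12, so the 13-bit field starts with 0.
    if not 0 < line_weight < 2 ** 12:
        raise ValueError("line weight out of range for this sketch")
    return format(line_weight, "013b")  # 13-bit binary, zero padded
```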

[Figure 3.1 - Huffman Codeword Assignment Process. Panels: Huffman Probability Table Reduction Process; Huffman Codeword Table Expansion Process]

The next step is to partition the CSL into segments of the length
specified by the operator. Each segment is encoded with a two-part
Segment State Word (SSW) that consists of a segment weight codeword
and a segment rank. As each segment is extracted from the CSL, it is
placed into a buffer and its weight is calculated. If the segment
weight is zero, the zero-weight codeword is placed into the SSW and
the segment rank, which is an integer value that indicates the
positions of the ones (1's) in the segment, is omitted.

If  the  segment  weight  is  greater  than  zero  and  less  than  the 
maximum  weight  allowable  for  a  segment  of  that  size,  the  rank  encoding 
procedure  is  initiated.  The  segment  buffer  is  examined  from  left  to 
right  to  determine  the  rank  of  the  segment  using  the  method  described 
in  Appendix  A.  Once  the  rank  is  determined,  it  is  placed,  along  with 
the  segment  weight  codeword,  in  the  output  buffer  using  the  prescribed 
number  of  bits  generated  by  the  rank  length  equation. 
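
The exact ranking method and rank length equation are given in Appendix A and are not reproduced in this section. As an illustration only, the sketch below uses standard enumerative (combinatorial) ranking with a left-to-right scan, and assumes the rank length is the number of bits needed to distinguish all C(n, w) segments of length n and weight w. The function names are hypothetical.

```python
from math import comb

def segment_rank(bits):
    """Rank of a 0/1 segment among all segments of equal length and weight."""
    n, remaining = len(bits), sum(bits)
    rank = 0
    for i, b in enumerate(bits):        # left-to-right scan
        if remaining == 0:
            break                       # no ones left; the rest is all zeros
        if b:
            # Count the patterns that have a 0 here and place all remaining
            # ones in the n - i - 1 positions to the right.
            rank += comb(n - i - 1, remaining)
            remaining -= 1
    return rank

def rank_length(n, weight):
    """Bits needed to hold any rank in the range 0 .. C(n, weight) - 1."""
    return (comb(n, weight) - 1).bit_length()
```

Under this convention a 32-bit segment of weight 4 needs a 16-bit rank, since C(32, 4) = 35960 lies between 2^15 and 2^16.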


If the segment weight is greater than the maximum allowable
weight, the rank encoding procedure is not invoked; the SSW for the
segment comprises the codeword for the "exceeded maximum"
condition followed by the segment in uncompressed form. Maximum
segment weight values must be employed in order to avoid arithmetic
overflow in the rank encoding computations. The maximum segment
weights for the various segment lengths are presented in Table 3.1;
these values are slightly different from those found in Appendix A
because the HP-1000 simulation system employs a sign bit in its 32-bit
double integer word. Therefore, the HP-1000 has a maximum integer
value of 2^31 instead of the value of 2^32 apparently employed to
construct Table A-1.
                 Maximum Segment Weight Value

Segment Length   16-bit Word Length   32-bit Word Length
in Bits

1-18             unconstrained        unconstrained
20               6                    unconstrained
24               5                    unconstrained
28               4                    unconstrained
32               4                    unconstrained
36               4                    12
40               3                    10
44               3                    9
48               3                    9
52               3                    8
56               3                    8
60               3                    7
64               3                    7

Table 3.1 - Maximum Weight Segment for Normal SLDC Processing
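
The overflow constraint behind Table 3.1 can be checked with a small sketch: for a segment of length n, the largest usable weight is the largest w for which the number of possible segments, C(n, w), stays below the integer limit. This reading of the constraint is an assumption, as is the function name; the limit 2^31 comes from the text above, while the limit 2^16 for the 16-bit column is inferred only because it agrees with the tabulated values.

```python
from math import comb

def max_segment_weight(segment_length, limit):
    """Largest weight w whose rank count C(n, w) stays below `limit`.

    Returns None ("unconstrained") when even the largest binomial
    coefficient C(n, n // 2) fits below the limit.
    """
    n = segment_length
    if comb(n, n // 2) < limit:
        return None
    w = 0
    while comb(n, w + 1) < limit:
        w += 1
    return w

# Example: the 36-bit row of the table.
row_36 = (max_segment_weight(36, 2 ** 16), max_segment_weight(36, 2 ** 31))
```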


For each segment encoded, the segment weight code length and the
rank length are added to the total number of bits encoded; this total
is used to compute the compression ratio of either the entire image
or, if requested, of each line.

3.2 Software Documentation

The software documentation for the SLDC encoder program is
presented in this section and includes a structure chart,
Nassi-Shneiderman flow charts for the major software modules, and
descriptions of the functions and data storage methods associated with
the encoder program. Also included in this section are the data flow
diagram for the overall simulation project and Nassi-Shneiderman flow
charts for the software modules that generate the Conditioned Image
Files; this additional documentation is not directly related to the
SLDC encoder program, but is presented here in order to convey a
clearer picture of the software simulation of the SLDC compression
algorithm.




[Structure Chart for Module: ENCODE]


Figure 3.3 - ENCODE.FTN - SLDC Encoding Program

    OPEN files and read input parameters

    Is segment weight = 0 or segment weight > max weight?
        NO:   Encode segment weight using codes from WEIGHT array
              and code lengths from CODLEN array generated by
              SUBROUTINE CODGEN
              RANK ENCODING PROCESS (see below)
        YES:  Segment weight = 0?
                  NO:   Encode segment in uncompressed form
                  YES:  Encode segment weight=0 using WEIGHT array

    END


RANK Encoding Process

    DO for # bits in segment OR "1"s seen = segment weight
    OR rest of segment is filled with "1"s
        Put present buffer word into an integer

    DO for # bits in segment OR "1"s seen = segment weight
    OR rest of segment is filled with "1"s
        Use RANK length equation described in SOW to determine RANK length

    Encode RANK length number of bits to encode rank in ESL


Figure 3.4 - CODGEN.FTN - SLDC Huffman Coding Routine

    DO for Weight = 0 to Max weight
        Calculate probabilities for each weight
        using EQ [A-4] in Appendix A

    Assign Probability to Maxweight + 1
    (1 - sum of probabilities)

    Bubble sort probabilities in descending order keeping
    track of original positions

    CALL HUFCOD to calculate codewords and codelengths
    for probabilities

    DO for weight = 0 to Maxweight + 1
        Put corresponding weight value back into proper weight
        subscript of Weight array and Code length array
        to provide easy access to codes and code lengths

    RETURN
Figure 3.5 - HUFCOD.FTN - Huffman Codeword Generator

    DO for 0 to Max weight of segment
        Add last two entries in column
        Determine its rank in next column
        Place it in the next column at its proper rank
        Mark the position in the position array

    Initialize codes and indices

    DO for Max weight down to 0
        Add a bit to codeword at next column's
        position according to position array
        Set new bit of last entry in column to "0"
        Set new bit of last entry in new column to "1"

    RETURN
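
For readers unfamiliar with Huffman coding, the following sketch builds a prefix code from a list of symbol probabilities, such as the segment weight probabilities plus the final "exceeded maximum" entry produced by CODGEN. It is not a transcription of HUFCOD: it replaces the column-based reduction and expansion tables of Figures 3.1 and 3.5 with a heap-based merge, and the function name and string codewords are hypothetical. Both approaches repeatedly combine the two least probable entries, although tie-breaking, and therefore the individual codewords, may differ.

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Return {symbol_index: codeword string} for a list of probabilities."""
    if len(probabilities) == 1:
        return {0: "0"}
    tie = count()                       # tie-breaker so dicts are never compared
    heap = [(p, next(tie), {i: ""}) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable group
        p1, _, group1 = heapq.heappop(heap)   # next least probable group
        # Prepend one more bit to every codeword in each merged group.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

# Example: four symbols with dyadic probabilities.
example = huffman_codes([0.5, 0.25, 0.125, 0.125])
```

In the example the code lengths come out as 1, 2, 3, and 3 bits, so the most probable segment weight receives the shortest codeword.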


Program Documentation for module: ENCODE

PROGRAM:        ENCODE

DESCRIPTION:    This program encodes an input Conditioned
                Image File using SLDC encoding techniques.
                The program interactively inquires for, then
                accepts input parameters used in runs having
                different file sizes and design parameter sets.
                A summary of each run is printed including:
                names and sizes of files used and compression
                statistics. Options to create an output file
                (and/or) print line by line compression statistics
                are also included.

RUNSTRING:      ENCODE,<INPUT NAME>,<OUTPUT NAME>

INPUT NAME      Input Image File name
OUTPUT NAME     Output Encoded File name (Optional)

ORDER OF
INPUT PARAMETERS:
                1)  # Records to be output
                2)  Decision to create output file
                3)  Maximum weight for segment length
                4)  CCITT file number
                5)  File resolution
                6)  Arithmetic word length
                7)  # Words per input record
                8)  # Records per input file
                9)  Segment length for compression
                10) Decision to print line by line compression
                    statistics

MODULES CALLED:
CODGEN          Huffman code generation subroutine. Generates
                Huffman codes for each line in input file
                with line weight greater than zero.

BINOM           Binomial coefficient function

MI2B            Subroutine to place an integer into a given
                position in an array

MB4B            Subroutine to move parts of one array into parts
                of another array.
MVBITS,
BTEST, IBSET    FORTRAN bit manipulating routines

NAMED COMMON
DESCRIPTIONS:

Block Name:        ENCOD
Module Common to:  CODGEN

Descriptions:

WEIGHT          Array of Huffman codes returned
                from CODGEN with array subscripts
                denoting segment weights

SEGL            Segment length design parameter
                used in EQ. [A-4].

NUMBER          Number of bits per CSL to be used
                in EQ. [A-4].

CODLEN          Code length array returned from
                CODGEN with values corresponding
                to those of WEIGHT array
Subroutine Documentation for module: CODGEN

SUBROUTINE:     CODGEN

MODULES
CALLED FROM:    ENCODE and DECODE

PURPOSE:        This subroutine generates probabilities for
                each possible weight from 0 up to and including
                the maximum weight allowable for a given segment + 1,
                passed by parameter. These probabilities are then
                used by SUBROUTINE HUFCOD to generate codes and code
                lengths for these probabilities.

MODULES CALLED:
BINOM           Binomial coefficient function

HUFCOD          Subroutine which assigns variable length Huffman
                codes and code lengths to segment weights which are
                arranged in most probable to least probable order.

CALLING FORMAT: CALL CODGEN(LNWGT, MAXWGT)

ARGUMENT
DESCRIPTIONS:

LNWGT           Current CSL weight generated by ENCODE.
                LNWGT is used by EQ. [A-4].

MAXWGT          Maximum segment weight encodable for design
                parameter segment length taken from TABLE [A-1].
                MAXWGT is also used by EQ. [A-4].

NAMED COMMON
DESCRIPTIONS:

Block Name:        ENCOD
Module Common to:  ENCODE

Descriptions:

WEIGHT          Array of Huffman codes returning
                to ENCODE with array subscripts
                denoting segment weights
SEGL            Segment length design parameter
                used in EQ. [A-4].

NUMBPL          Number of bits per CSL to be used
                in EQ. [A-4].

CODLEN          Code length array returning to
                ENCODE with values corresponding
                to those of WEIGHT array

Block Name:        HUFBLK
Module Common to:  HUFCOD

Descriptions:

PROB            Probability array containing
                probabilities of segment weights
                of a given CSL.

CODE            Array returned holding Huffman code
                in most probable to least probable
                order

MAXWT           Maximum weight of a segment
                (described above)

CDWLEN          Code length array matching values
                in CODE array.
Subroutine Documentation for module: HUFCOD.FTN

SUBROUTINE:     HUFCOD

MODULE
CALLED FROM:    CODGEN

PURPOSE:        This subroutine takes probabilities generated by
                CODGEN and determines the positions for a right
                to left expansion of the Huffman coding table
                according to the values of the probabilities.

CALLING FORMAT: CALL HUFCOD

NAMED COMMON
DESCRIPTIONS:

Block Name:        HUFBLK
Module Common to:  CODGEN

Descriptions:

PROB            Probability array containing
                probabilities of segment weights
                of a given CSL.

CODE            Array returning Huffman codes
                in most probable to least probable
                order

WGTNUM          Maximum weight of a segment
                (described above)

CDWLEN          Code length array matching values
                in CODE array.


Subroutine Documentation for module: BINOM.FTN

FUNCTION:       BINOM

MODULES
CALLED FROM:    CODGEN, ENCODE, DECODE

PURPOSE:        This function uses an algorithm found in CACM
                to calculate the binomial coefficient given
                two parameters. The maximum value returned is
                2^31, which is the maximum value representable
                in a FORTRAN DOUBLE INTEGER.

CALLING FORMAT: n = BINOM(N,M) where n is a single or double integer

ARGUMENT
DESCRIPTIONS:

N,M             N objects taken M at a time

XMB4BE . LIB : s DAF4LIB 
FORTRAN  77  Subroutines 
Author:  S.  J.  Urban 

8/5/85 


DIS FORTRAN 77 bit handling routines

FUNCTION:   Provides facilities for storing integers, in
            packed form, in variables and arrays, and for
            retrieving them at a later time.

CALLS:      CALL MI2B(I2, BA, JB, NB)

            Stores the INTEGER*4 I2 into BA, starting
            at the JBth bit, and occupying NB bits.

            I4B(BA, JB, NB)

            Retrieves (as its functional value) the INTEGER*4
            integer stored in BA, starting at the JBth bit,
            and occupying NB bits. I4B returns an INTEGER*4
            integer value and must be declared as such in
            the calling routine.

            CALL MB4B(TBA, JTB, NB, FBA, JFB)

            Replaces the JTBth through the (JTB + NB - 1)st
            bits of TBA with the JFBth through the
            (JFB + NB - 1)st bits of FBA.

ARGUMENTS:  I2 - INTEGER*4 variable or array element

            JTB, JFB, JB - INTEGER*2 starting bit position
            for string; must be > 0.

            NB - INTEGER*2 no. of bits in a string (i.e.,
            string size); must be > 0.

            TBA, FBA, BA - INTEGER*2 or INTEGER*4 arrays used for
            storing "packed" bit strings.
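
The behavior of the three bit handling routines can be imitated in a few lines. The sketch below is a loose analogue, not the DIS library: it models a packed bit array as a Python list of 0/1 values rather than INTEGER words, uses lowercase function names, and assumes 1-based bit positions (consistent with the "must be > 0" requirement) with the most significant bit stored first.

```python
def mi2b(value, ba, jb, nb):
    """Store `value` in ba[jb .. jb+nb-1] (1-based), most significant bit first."""
    for k in range(nb):
        ba[jb - 1 + k] = (value >> (nb - 1 - k)) & 1

def i4b(ba, jb, nb):
    """Return the integer held in ba[jb .. jb+nb-1] (1-based)."""
    value = 0
    for bit in ba[jb - 1:jb - 1 + nb]:
        value = (value << 1) | bit
    return value

def mb4b(tba, jtb, nb, fba, jfb):
    """Copy nb bits from fba starting at bit jfb into tba starting at bit jtb."""
    tba[jtb - 1:jtb - 1 + nb] = fba[jfb - 1:jfb - 1 + nb]

# Example: pack the value 5 into bits 3..6, read it back, then copy it.
source = [0] * 16
mi2b(5, source, 3, 4)
target = [0] * 8
mb4b(target, 5, 4, source, 3)
```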


[Figure 3.6 - SLDC Data Flow Diagram. Data items shown: Input Image File; Prediction statistics; Conditioned Image File & file statistics; Encoded Image File; Conditioned Image File; Original Image File]


Figure 3.7 - CHART.FTN - SLDC Probability Statistics Generator

    OPEN files and read input parameters

    Initialize all count arrays and buffers

    DO for # records in file
        DO for # words in input record
            DO for # bits per word
                Test for value of current bit being examined
                Test for neighboring bit values
                Search template array for current
                neighbor template
                Increment count array for total,
                and black white count of current
                neighbor template

    Print all statistics


Figure 3.8 - CREATE.FTN - SLDC CIF Generating Program

    OPEN files and read input parameters

    READ Probability Data Generated from PROGRAM CHART

    Initialize buffers and count arrays

    DO for # records in file
        DO for # words in input record
            DO for # bits per word
                Test for neighboring bit values
                Search probability data for proper
                neighbor template and respective prediction
                Set current bit to "1"
                Increment lineweight count, 32 bit segment weight
                count, and 64 bit segment weight count
                Reach end of 32 bit segment?
                    YES:  Increment proper 32 bit weight subscript
                          in 32 bit count array
                Increment proper 64 bit weight subscript in 64 bit
                count array
            Put present output word in output buffer
        Increment proper line weight subscript in line weight
        count array
        Add line weight to file weight
        WRITE output buffer to output file

    Print out all statistics

    CLOSE files

    END
Figure 3.9 - RESTORE.FTN - SLDC Original Restoration Program

    OPEN files and read input parameters

    READ Probability Data Generated from PROGRAM CHART

    Initialize buffers and count arrays

    DO for # records in file
        Is current bit 1?
            YES:  Place bit opposite of prediction bit in
                  current bit position
            NO:   Place prediction bit in current bit position
        Put present output word in output buffer
        WRITE output buffer to output file


4.0  SLDC  Decoder  Software  Design 

The  SLDC  decoder  program  reconstructs  the  Conditioned  Image  Files 
(CIF's)  compressed  in  the  previous  subtask  by  employing  the  identical 
codeword  generator  and  rank  encoding  procedure  employed  in  the  SLDC 
encoder  program.  A  brief  description  of  the  decoding  process  is 
presented  here,  along  with  the  documentation  for  the  software  modules 
associated  with  the  decoder  program.  The  operating  instructions  for 
the  decoder  program  are  presented  in  Appendix  B;  the  code  listings 
appear  in  Appendix  C. 

4.1 Functional Description

Decoding  is  also  done  one  line  at  a  time.  Each  ESL  is  read  into 
a  buffer  and  the  first  bit  in  the  line  (bit  15  in  word  1)  is  checked. 
If  the  bit  is  set  to  one  (1),  the  line  weight  is  zero  and  all  bits  in 
the  decoded  CSL  are  set  to  zero  (0).  If  the  bit  is  not  set  to  one 
(1),  the  first  13  bits  of  the  ESL  are  interpreted  as  the  line  weight 
and  the  Huffman  coding  subroutine  is  called  to  generate  Huffman  codes 
for  all  possible  segment  weights  for  that  particular  line  weight.  The 
program  proceeds  by  decoding  one  segment  at  a  time;  a  number  of  bits 
equal  to  the  maximum  segment  weight  codeword  are  read  from  the  ESL 
into  a  buffer.  The  segment  weight  is  then  decoded  by  comparing  each 
code  in  the  Huffman  code  array  with  the  same  number  of  bits  in  the 
codeword  buffer  until  the  code  is  found. 
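
The codeword search described above can be sketched as a simple prefix match. The bit-string representation, the function name, and the convention that the last table entry is the "exceeded maximum" code are assumptions made for illustration. Because a Huffman code is prefix-free, at most one codeword can match at a given position.

```python
def decode_segment_weight(esl_bits, position, weight_codes):
    """Match one Huffman codeword at esl_bits[position:].

    weight_codes[w] is the codeword string for segment weight w; the last
    entry is assumed to be the "exceeded maximum" code.
    Returns (weight_index, new_position).
    """
    for weight, code in enumerate(weight_codes):
        # Compare the code with the same number of bits from the buffer.
        if esl_bits.startswith(code, position):
            return weight, position + len(code)
    raise ValueError("no codeword matches at this position")
```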

If the segment weight is found to be zero, the next n bits in the
decoded CSL, where n is the segment length, are set to zero (0). If
the segment weight is found to be greater than the maximum allowable
weight, the segment has been "transmitted" uncompressed, and the next
n bits in the ESL are placed directly into the decoded CSL, where,
again, n is the segment length.

For  the  remaining  condition,  in  which  the  segment  weight  is 
greater  than  zero  and  less  than  the  maximum  allowable,  the  length  of 
the  rank  can  be  calculated  by  using  the  segment  weight  in  the  rank 
length  equation.  The  rank  is  then  determined  from  the  integer  value 
of  the  next  r  bits  in  the  ESL,  where  r  is  the  determined  rank  length. 
The  segment  is  reconstructed  by  placing  a  one  (1)  in  the  output 
segment  buffer  whenever  the  difference  between  the  rank  and  the 
calculated  binomial  coefficient  (see  Appendix  A)  is  positive;  when  a 
one  (1)  is  found,  the  rank  is  decremented  by  an  amount  equal  to  the 
coefficient.  This  rank  decoding  process  continues  until  the  entire 
rank  is  examined,  or  the  rank  is  zero  after  the  decrement  takes  place. 
The  output  segment  buffer  is  then  placed  in  the  decoded  CSL.  The 
segment  decoding  continues  until  all  segments  in  the  line  have  been 
decoded,  or  until  the  sum  of  the  segment  weights  of  the  segments 
placed  in  the  decoded  CSL  is  equal  to  the  line  weight. 
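
The rank decoding loop can be illustrated with standard enumerative decoding. The precise method is defined in Appendix A, so the sketch below is an assumption about its general shape: at each bit position the rank is compared with a binomial coefficient, a one is emitted when the rank is at least as large as the coefficient, and the rank is then reduced by that coefficient. The function name is hypothetical.

```python
from math import comb

def segment_from_rank(rank, n, weight):
    """Rebuild the n-bit segment of the given weight from its rank."""
    bits, remaining = [], weight
    for i in range(n):
        # Number of patterns with a 0 at position i and all remaining
        # ones in the n - i - 1 positions to the right.
        c = comb(n - i - 1, remaining)
        if remaining > 0 and rank >= c:
            bits.append(1)
            rank -= c                   # decrement the rank by the coefficient
            remaining -= 1
        else:
            bits.append(0)
    return bits

# Example: all six 4-bit segments of weight 2, in rank order.
segments = [segment_from_rank(r, 4, 2) for r in range(6)]
```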

4.2 Software Documentation

The software documentation for the SLDC decoder program is
presented in this section and includes a structure chart, a
Nassi-Shneiderman flow chart, and a description of the functions and
data storage methods associated with the main software module, DECODE.
All other modules associated with the decoder program are identical to
those of the encoder program; the documentation for these modules
appears in Section 3.2.
Figure 4.1 - DECODE.FTN - SLDC Decoding Program

    OPEN files and read input parameters

    DO for # records in file
        CALL CODGEN with lineweight and max weight of segment
        DO WHILE # of "1"s put in output line < line weight
            READ in code buffer (double integer from ESL)
            DO WHILE segment weight code is not found
                Check current code in array with same
                number of bits in code buffer
            One of:
                Plug line weight into RANK length equation
                to get rank length; use that many bits of
                ESL and place decoded rank in output buffer
                Place uncoded segment in output buffer

    END
Program Documentation for module: DECODE

PROGRAM:        DECODE

DESCRIPTION:    This program decodes an input Encoded
                Image File coded using SLDC encoding techniques.
                The program interactively inquires for, then
                accepts input parameters used in runs having
                different file sizes and design parameter sets.
                A summary of each run is printed including
                names and sizes of files used and design parameters.

RUNSTRING:      DECODE,<INPUT NAME>,<OUTPUT NAME>

INPUT NAME      Input Image File name
OUTPUT NAME     Output Conditioned Image File

ORDER OF
INPUT PARAMETERS:
                1)  # Words per output record
                2)  # Records to be output
                3)  Arithmetic word length
                4)  # Words per input record
                5)  # Records per input file
                6)  Segment length for compression
                7)  Maximum weight per segment

MODULES CALLED:
CODGEN          Huffman code generation subroutine. Generates
                Huffman codes for each line in input file
                with line weight greater than zero.

BINOM           Binomial coefficient function

I4B             Function to extract an integer from a given
                position in an array

MB4B            Subroutine to move parts of one array into parts
                of another array.

MVBITS, IBITS,
BTEST, IBSET    FORTRAN bit manipulating routines

NAMED COMMON
DESCRIPTIONS:

Block Name:        ENCOD
Module Common to:  CODGEN

Descriptions:

WEIGHT          Array of Huffman codes returned
                from CODGEN with array subscripts
                denoting segment weights

SEGL            Segment length design parameter
                used in EQ. [A-4].

NUMBER          Number of bits per CSL to be used
                in EQ. [A-4].

CODLEN          Code length array returned from
                CODGEN with values corresponding
                to those of WEIGHT array
5.0 SLDC Compression Results and Analysis


Simulation  runs  made  on  each  image  using  the  SLDC  technique 
included  design  parameters  of  both  16  and  32  bit  arithmetic  word 
lengths  using  segment  lengths  from  8  to  64,  in  increments  of  8.  The 
parameters  were  chosen  to  experiment  with  many  combinations  of  word 
lengths  and  segment  lengths  to  determine  the  most  efficient  parameter 
set  for  each  image  at  each  resolution. 

The  results  from  the  SLDC  compression  simulation  runs  showed  the 
segment  lengths  using  the  32  bit  arithmetic  word  to  have  consistently 
higher  compression  on  each  image  tested  at  all  four  resolutions, 
except  for  segment  lengths  of  8  and  16,  where  compression  is  equal  due 
to  the  design  of  the  algorithm.  The  higher  compression  in  the  32  bit 
word  simulation  runs  is  related  to  the  maximum  weight  constraint 
difference  between  the  16  and  32  bit  word  lengths,  where  compression 
of  the  segment  to  be  encoded  is  bypassed  if  the  weight  of  the  segment 
exceeds  the  maximum  weight  constraint.  The  compression  bypass 
involves skipping the rank encoding part of the algorithm when the
segment  weight  is  higher  than  the  maximum  weight  allowable  for  that 
segment.  Another  direct  result  of  the  compression  bypass  is  a 
difference  in  processing  time,  which  is  consistently  lower  in  the  16 
bit  word  simulation  runs  due  to  the  processing  skipped  when  the  rank 
encoding  is  omitted.  The  processing  times  of  all  of  the  simulation 
runs  appear  to  be  directly  related  to  the  maximum  weight  constraint 
and almost independent of segment length, which indicates that the
maximum weight constraint is the parameter that determines the
difference in the complexity of the algorithm between simulation runs.


The comparison of the overall compression between images was as
expected, with the English letter compressing the best, followed by
the French journal, and then the Kanji text. The compression
statistics are presented in Tables 5.1 through 5.12, and are
illustrated graphically in Figures 5.1 through 5.12. (In all of the
compression graphs, the □ indicates results of the 16 bit word length
runs and the + indicates the results of the 32 bit word length runs.)
The French and English document compression curves are quite similar,
with maximum compression of both 16 and 32 bit word length runs
occurring at the same segment lengths. It appears that the
similarities between the two compression curves are a result of the
similar probability distributions of the two files along with similar
CIF constructions. The Kanji compression curve is different from
those of the French and English documents. The Kanji 32 bit word
curve is without the "peaking" effect demonstrated by the French and
English curves. The maximum compression point is virtually
indistinguishable in the 24 to 56 bit segment range, as compared to
the definite peaks in both the English and French curves.

The maximum compression point varies with resolution. As
resolution increases, the segment length of the maximum point of the
16 and 32 bit word length run curves increases. This can be described
as a "right shift" of the peaks in the graphs. The shift is due to
more segments with lower segment weights as the resolution increases;
therefore, fewer segments exceed the maximum weights at higher segment
lengths, and this leads to optimum compression at higher segment
lengths. The suggested set of parameters to achieve maximum
compression, taking processing time into consideration, is presented
in Table 5.13.


[Figure: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8          4.88          22.19
16                 16          5.41          24.32
16                 24          5.33          19.31
16                 32          5.00          19.00
16                 40          4.06          18.39
16                 48          3.68          17.56
16                 56          3.40          17.22
16                 64          3.15          16.49
32                 24          5.41          31.22
32                 32          5.44          37.51
32                 40          5.44          22.17
32                 48          5.41          20.41
32                 56          5.37          19.12
32                 64          4.99          18.41

Table 5.3 - Compression Stats, Kanji Text - 200 LPI
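The way a suggested segment length can be read off a table such as Table 5.3 may be sketched as follows. This is only an illustration: the selection rule (highest compression, ties broken by the lower run time) is an assumption, since the report says only that processing time was taken into consideration, and the function name `best_segment_length` is hypothetical. Run times are compared as the plain numbers printed in the table.

```python
# Rows of Table 5.3 (Kanji text, 200 LPI):
# (word length, segment length, compression, run time as printed)
TABLE_5_3 = [
    (16, 8, 4.88, 22.19), (16, 16, 5.41, 24.32), (16, 24, 5.33, 19.31),
    (16, 32, 5.00, 19.00), (16, 40, 4.06, 18.39), (16, 48, 3.68, 17.56),
    (16, 56, 3.40, 17.22), (16, 64, 3.15, 16.49),
    (32, 24, 5.41, 31.22), (32, 32, 5.44, 37.51), (32, 40, 5.44, 22.17),
    (32, 48, 5.41, 20.41), (32, 56, 5.37, 19.12), (32, 64, 4.99, 18.41),
]


def best_segment_length(rows, word_length):
    """Return the segment length with the highest compression for one
    arithmetic word length.

    Assumed tie-break: when two segment lengths give the same
    compression, prefer the one with the lower run time.
    """
    candidates = [row for row in rows if row[0] == word_length]
    best = max(candidates, key=lambda row: (row[2], -row[3]))
    return best[1]


if __name__ == "__main__":
    for word_length in (16, 32):
        print(word_length, best_segment_length(TABLE_5_3, word_length))
```

Under this assumed rule the Kanji 200 LPI data give a segment length of 16 for the 16 bit word and 40 for the 32 bit word.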
ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8         17.08          13.04
16                 16         20.88          14.06
16                 24         21.97          10.56
16                 32         20.69           9.56
16                 40         18.07           9.46
16                 48         16.84           9.14
16                 56         15.93           9.21
16                 64         15.19           9.06
32                 24         22.30          17.02
32                 32         22.87          20.10
32                 40         23.16          12.25
32                 48         22.93          11.57
32                 56         22.33          11.28
32                 64         20.55          10.44

Table 5.4 - Compression Stats, English Letter - 240 LPI


[Figures: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8         19.43          18.82
16                 16         23.91          20.14
16                 24         25.86          16.12
16                 32         24.73          15.03
16                 40         21.32          14.32
16                 48         19.97          14.08
16                 56         18.83          14.18
16                 64         17.60          13.58
32                 24         26.08          23.28
32                 32         26.86          27.20
32                 40         27.35          18.07
32                 48         27.43          17.12
32                 56         27.17          16.53
32                 64         25.24          16.03


[Figure: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8          9.58          35.42
16                 16         12.78          34.57
16                 24         13.91          32.49
16                 32         13.51          27.21
16                 40         11.89          26.37
16                 48         11.21          25.46
16                 56         10.64          25.31
16                 64         10.10          24.45
32                 24         14.04          41.18
32                 32         14.67          45.25
32                 40         15.02          38.37
32                 48         15.10          37.41
32                 56         14.93          36.27
32                 64         13.94          35.34

Table 5.8 - Compression Stats, French Journal - 300 LPI

[Figures: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8         19.63          35.39
16                 16         26.88          35.05
16                 24         29.95          29.11
16                 32         30.04          27.24
16                 40         26.40          26.37
16                 48         24.63          25.40
16                 56         23.24          26.00
16                 64         21.88          25.23
32                 24         30.04          39.48
32                 32         31.71          45.02
32                 40         32.66          31.15
32                 48         33.11          30.01
32                 56         33.45          29.52
32                 64         31.81          28.31

Table 5.10 - Compression Stats, English Letter - 400 LPI


[Figure: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8         10.40            54.47
16                 16         14.67            53.14
16                 24         16.62            48.23
16                 32         16.67            40.23
16                 40         14.75            37.45
16                 48         13.94            36.27
16                 56         13.25            36.40
16                 64         12.45            35.25
32                 24         16.68          1:01.34
32                 32         17.65          1:08.29
32                 40         18.31            45.50
32                 48         18.62            43.57
32                 56         18.81            42.28
32                 64         16.89            40.01

Table 5.11 - Compression Stats, French Journal - 400 LPI


[Figure: Segment Length vs. Compression. x-axis: segment length.]


ARITHMETIC WORD    SEGMENT
LENGTH             LENGTH     COMPRESSION    RUN TIME

16                  8          6.71          1:09.46
16                 16          8.63          1:06.20
16                 24          9.26            55.22
16                 32          8.89            49.39
16                 40          7.58            47.53
16                 48          7.00            46.03
16                 56          6.59            47.24
16                 64          6.05            45.11
32                 24          9.32          1:16.32
32                 32          9.39          1:26.23
32                 40          9.32            58.29
32                 48          9.28            55.56
32                 56          9.40            56.01
32                 64          9.12            53.27

Table 5.12 - Compression Stats, Kanji Text - 400 LPI
[Figure: Segment Length vs. Compression. x-axis: segment length.]


Table 5.13

Suggested Parameters for SLDC Encoding Program

                    Suggested Segment Lengths for Optimum SLDC Compression

                    16 bit Arithmetic    32 bit Arithmetic
File                word length          word length

English 200 lpi     24                   40
French  200 lpi     24                   40
Kanji   200 lpi     16                   40
English 240 lpi     24                   40
French  240 lpi     24                   40
Kanji   240 lpi     16                   40
English 300 lpi     24                   48
French  300 lpi     24                   48
Kanji   300 lpi     24                   56
English 400 lpi     32                   56
French  400 lpi     32                   56
Kanji   400 lpi     24                   56


An analysis of the 32 and 64 bit histograms described in the SOW
versus those expected on the basis of a memoryless binomial
distribution was made in order to determine if the deviation between
the actual and expected probability distributions was significant.
The results of this analysis are presented graphically in Figures 5.13
through 5.33. (In the probability distribution graphs, the □ indicates
the expected distribution curve and the + indicates the actual
distribution curve.) Every graph except the 200 resolution Kanji and
64 bit 400 resolution Kanji graphs has a second graph associated with
it in which the deviation between the curves is presented in greater
detail. The results show a definite, significant deviation between the
actual and expected histograms, which is most evident in the Kanji
graphs. It appears that, in both the 32 and 64 bit cases, the number
of occurrences of segment weights < 3 is significantly lower than
expected, and the number of occurrences of segment weights > 3 is
significantly higher than expected. Attempts were made to alleviate
this problem by producing a more predictable CIF, increasing the
neighbor template from 4 to 7 bits. The three extra bits were placed
in different positions available to the decoder, around the already
existing template. This made the actual curve slightly more binomial
in nature, but the changes in compression were small.
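The comparison described above can be sketched in a few lines. The following is a minimal illustration, not the analysis program used in the study: the function names are hypothetical, the expected counts follow the memoryless binomial model of EQ. A-4, and the toy segments in the usage lines are invented purely to show the calling convention.

```python
from math import comb


def expected_histogram(n, p, num_segments):
    """Expected number of n-bit segments of each weight 0..n when every
    bit is an independent "1" with probability p (binomial model)."""
    return [num_segments * comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1)]


def actual_histogram(segments, n):
    """Observed number of segments of each weight 0..n.

    Each segment is a list of n bits (0 or 1)."""
    counts = [0] * (n + 1)
    for segment in segments:
        counts[sum(segment)] += 1
    return counts


if __name__ == "__main__":
    # Toy data only: six empty segments and two heavy ones (n = 4).
    toy = [[0, 0, 0, 0]] * 6 + [[1, 1, 1, 0]] * 2
    p = sum(map(sum, toy)) / (4 * len(toy))   # average "1" probability
    print(actual_histogram(toy, 4))
    print([round(x, 2) for x in expected_histogram(4, p, len(toy))])
```

When the "1"-bits cluster, as in the toy data, the observed counts at the heavy weights exceed the binomial expectation, which is the kind of deviation the graphs illustrate.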

There  are  two  possible  ways  to  improve  the  SLDC  technique.  The 
first  is  the  aforementioned  attempt  to  improve  the  CIF  to  be  more 
binomially  predictable,  which  was  not  successful.  The  second  would  be 
to  replace  the  predictive  encoding  algorithm,  which  is  binomially 
based,  with  a  function  which  better  fits  the  actual  statistics  of  the 
CIF's. 
[Figures 5.13 through 5.33: Actual vs. Expected Histogram. x-axis:
segment weight; y-axis: occurrences (thousands). Captions: Figure 5.17
- Full Scale; Figure 5.19 - Full Scale; Figure 5.26 - Enlarged Section;
Figure 5.33 - Full Scale.]


6.0  Conclusions  and  Recommendations 


In simulation runs of the SLDC algorithm, the overall compression
achieved by compressing CIF's using the SLDC encoding algorithm did
not compare favorably with the higher compression achieved by the
MODREAD II algorithm run on the same original images. A side-by-side
comparison of the results of the two algorithms is presented in Table
6.1; the large difference in compression between the two algorithms is
made more evident when displayed graphically in Figures 6.1 through
6.4. (In the comparison graphs, the □ indicates the SLDC compression
results and the + indicates the MODREAD II compression results.)

Compression differences between individual images show the Kanji
documents to have the closest compression to MODREAD II, followed by
the English, then French documents. The difference in compression
between the two algorithms increases as the resolution increases. The
compression vs. resolution curves of both algorithms are approximately
linear, with the MODREAD II curve having the steeper, more positive
slope; on the average, its slope is about twice that of the SLDC curve.

One possible way to reach better compression, closer to that of
MODREAD II, would be a better predicting function, as mentioned
earlier, to better match the actual statistics of the CIF's. Another
suggestion to consider is a new CIF prediction scheme which would
generate CIF statistics that more closely match those of a binomial
distribution.
CCITT             Resolution,    SLDC           MODREAD II     %
IMAGE             LPI            Compression    Compression    Diff
                                 Ratio          Ratio

#1                200            20.17          30.57          34.0
English           240            23.16          36.54          36.6
Letter            300            27.43          45.44          39.6
                  400            33.45          59.57          43.8

#5                200            11.61          17.61          34.1
French            240            12.90          21.00          38.6
Journal           300            15.10          25.91          41.7
                  400            18.81          34.55          45.5

#7                200             5.44           7.59          28.3
Kanji             240             6.23           9.12          31.7
Text              300             7.34          11.43          35.8
                  400             9.40          15.50          39.4

                  200             9.39          13.56          30.8
AVERAGE           240            10.67          16.25          34.3
                  300            12.56          20.26          38.0
                  400            15.84          27.22          41.8

Table 6.1 - SLDC vs. MODREAD II Comparison
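The derived columns of Table 6.1 can be checked with a short calculation. The sketch below is an interpretation, not a statement of how the report computed them: the "% Diff" figures agree with (MODREAD II - SLDC) / MODREAD II, and the AVERAGE rows agree with a harmonic mean of the three per-image ratios (equivalent to pooling raw and compressed bits of three images of equal raw size). The function names are hypothetical.

```python
def percent_diff(sldc_ratio, modread_ratio):
    """Shortfall of the SLDC ratio relative to the MODREAD II ratio, in percent."""
    return 100.0 * (modread_ratio - sldc_ratio) / modread_ratio


def pooled_ratio(ratios):
    """Harmonic mean of compression ratios.

    For images of equal raw size this equals (total raw bits) divided
    by (total compressed bits)."""
    return len(ratios) / sum(1.0 / r for r in ratios)


if __name__ == "__main__":
    # 200 LPI rows of Table 6.1: English letter, French journal, Kanji text.
    sldc_200 = [20.17, 11.61, 5.44]
    modread_200 = [30.57, 17.61, 7.59]
    print(round(percent_diff(sldc_200[0], modread_200[0]), 1))
    print(round(pooled_ratio(sldc_200), 2))
    print(round(pooled_ratio(modread_200), 2))
```

A plain arithmetic mean of the three 200 LPI SLDC ratios would be about 12.4, not the 9.39 shown in the table, which is why the harmonic mean is assumed here.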


[Figures 6.1 through 6.4: SLDC vs. MODREAD II comparison graphs. x-axis: resolution.]


APPENDIX  A  —  DESCRIPTION  OF  SCAN  LINE  DIFFERENCE  COMPRESSION  ALGORITHM 


A-1. INTRODUCTION


This Appendix describes a Scan Line Difference Compression (SLDC)
algorithm which is hypothesized to be an efficient distortionless data
compression algorithm as well as a technique which can easily be implemented
using existing microcomputer and LSI technology. An experimental test of this
technique is about to be initiated. Until these experimental results are
available, it shall be necessary to rely on "arm-waving" arguments to explain
why the SLDC algorithm is considered to be a viable candidate for facsimile
data compression. These arguments are given in Section A-3.

A-2.  DESCRIPTION  OF  SLDC  ALGORITHM 


Figure A-1 is a diagram of the overall image transmission system. As
shown in this figure, the overall system is partitioned into:

1. the source coding subsystem, concerned with the minimization of the
number of bits necessary to permit the distortionless reconstruction
of an image, and,

2. the channel coding subsystem, which attempts to correct whatever
errors may be introduced during the signal transmission process.

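             |<-------------- SLDC Encoder ------------->|

+--------+   +-------------+   +-------------+   +---------------+
| Image  |-->| Conditioned |-->| Conditioned |-->| Error Control |
| Source |   | Scan Line   |   | Scan Line   |   | Encoder &     |
|        |   | Generator   |   | Encoder     |   | Transmitter   |
+--------+   +-------------+   +-------------+   +---------------+
     Sequential       Conditioned       Compressed       |
     Scan Lines       Scan Lines        Conditioned      +<-- Noise
                                        Scan Lines       |
+--------+   +-------------+   +-------------+   +---------------+
| Image  |<--| Scan Line   |<--| Conditioned |<--| Receiver and  |
| Sink   |   | Recon-      |   | Scan Line   |   | Error Control |
|        |   | struction   |   | Decoder     |   | Decoder       |
+--------+   +-------------+   +-------------+   +---------------+

             |<-------------- SLDC Decoder ------------->|

Figure A-1. -- Overall Data Compression Process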

This Appendix deals exclusively with the source coding aspects and presumes
that the transmission process is noiseless, either due to the inherent absence
of noise or due to an effective channel coding subsystem providing the
necessary error control.


The SLDC system consists of an encoder, located at the image generation
source, and a decoder, located at the point where the image is regenerated.
The encoder and decoder represent a transform pair in which the decoder
performs the inverse process to that performed by the encoder. For most of
the remaining discussion, the SLDC algorithm shall be described in terms of
the encoding functions only, since that effectively defines the entire process.

It is assumed that the image to be compressed can be represented as a
two dimensional rectangular binary matrix composed of "r" rows and "c"
columns. Every matrix element is either white (0) or black (1). The raw
image can therefore be expressed as a binary sequence rc bits in length.
The image is output from the source as r successive scan lines, where each
scan line consists of a string of c bits.

If the process generating the image were memoryless (i.e., future image
elements were independent of all past image elements) and if the probability
of a "1" or a "0" at each element were 0.5, then distortionless compression
would not be effective and no better scheme than the transmission of the raw
rc image bits would be possible. However, in typical images to be
transmitted, there is considerable inherent redundancy in the raw image. In
these cases, it is generally possible to transmit an encoded replica of the
image with T bits (T less than rc) such that the image may be exactly
reconstructed using only the T transmitted bits. The efficiency of the
compression process can be measured by the compression ratio, Rc, given by

Rc = rc/T [EQ. A-1]

The  SLDC  algorithm  for  the  distortionless  compression  of  an  image 
consists  of  a  Conditioned  Scan  Line  Computation  step  and  a  Compression  step. 

A-2.1  Conditioned  Scan  Line  Computation  Step. 

The purpose of this step is to process an image consisting of a number
of scan lines and to generate a Conditioned Image File (CIF), composed of a
set of Conditioned Scan Lines (CSLs), subject to the following conditions:

1. the original image can be reconstructed from the CIF without
distortion, and,

2. the information theoretic entropy of each CSL is approximately
minimized subject to practical processing limitations.

Although these two conditions may appear to be contradictory, they really are
not. The first condition requires that the CIF, consisting of the set of
CSLs, must be transformable back to the set of original scan lines without
distortion. The second condition is concerned only with the compressibility
of the image considered one scan line at a time. The conditioning procedure
should transform the original scan line into a CSL having lower entropy than
the original when considered on an individual scan line basis.




The conditioning algorithm shall generate a prediction of a scan line
element based upon the state of the four neighboring and previously scanned
image elements as shown in Figure A-2.

                      Direction of scan -->

Previous Scan Line:   ... A B C ...

Current Scan Line:    ... D X   ...
                            |
                            +-- Element to be predicted

Figure A-2 -- Nearest Neighbor Elements

The CIFs shall be generated based upon a State Machine specification
that shall predict the state of each scan element based upon Bits A, B, C, and
D, where "0" and "1" represent the "expected" and "unexpected" state
respectively. A ring of "0" bits shall be presumed to encircle the perimeter
of the raw image. This is relevant only when the element being predicted is
on the image boundary. The prediction condition algorithm shall be completely
defined by a state table that has not yet been determined.
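The conditioning step and its inverse can be sketched as follows. Since the state table is not defined in this Appendix, the `predict` rule below (predict black when at least two of the four neighbors are black) is purely a placeholder assumption, as are the function names and the reading of Figure A-2 in which A, B, and C lie above-left, above, and above-right of X and D lies to its left. The point of the sketch is only that any predictor using previously scanned elements can be inverted exactly by the decoder.

```python
def predict(a, b, c, d):
    """Placeholder predictor (assumption): black if two or more neighbors are black."""
    return 1 if (a + b + c + d) >= 2 else 0


def neighbors(prev_line, cur_line, x):
    """Return (A, B, C, D) for column x; elements outside the image read as 0."""
    width = len(cur_line)
    a = prev_line[x - 1] if x > 0 else 0
    b = prev_line[x]
    c = prev_line[x + 1] if x + 1 < width else 0
    d = cur_line[x - 1] if x > 0 else 0
    return a, b, c, d


def condition(image):
    """Raw image (rows of 0/1) -> CIF of the same shape.

    A CIF bit is 1 ("unexpected") where the prediction was wrong."""
    width = len(image[0])
    prev = [0] * width                      # ring of "0" bits above line 1
    cif = []
    for line in image:
        cif.append([line[x] ^ predict(*neighbors(prev, line, x))
                    for x in range(width)])
        prev = line
    return cif


def reconstruct(cif):
    """Inverse of condition(): rebuild the raw image from the CIF."""
    width = len(cif[0])
    prev = [0] * width
    image = []
    for csl in cif:
        line = [0] * width
        for x in range(width):
            # Only already-decoded elements (A, B, C, D) are consulted.
            line[x] = csl[x] ^ predict(*neighbors(prev, line, x))
        image.append(line)
        prev = line
    return image
```

For example, `reconstruct(condition(image)) == image` holds for any rectangular binary image, which is the first of the two conditions stated in Section A-2.1.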


A-2.2 Compression Encoding Step.

The second step of the SLDC encoding process represents the actual
compression process. For each Conditioned Scan Line (CSL) string to be
compressed, the following substeps shall occur:

1. Encode the weight (W) of the CSL. If W=0, proceed to the next CSL;
otherwise, proceed to Substep 2 below.

2. Segment the CSL into m segments of n bits each (mn=c), where n is a
design parameter [NOTE: if c is prime or not divisible by a convenient
integer, it may be necessary to pad the CSL by appending some "0"-bits in
order to achieve a convenient divisor], and,

3. Compute and encode for transmission a Segment State Word (SSW) for
each of the m segments. Each SSW shall uniquely define the state of the
respective segment and shall, in general, be encoded into fewer than
n bits. An SSW contains the following two variable length fields:

a. WEIGHT, the weight of the segment, and,

b. RANK, an index value which uniquely specifies the bit
positions of all "1"-bits within the segment. If the
WEIGHT value = 0, the RANK field is omitted.

After the last non-null segment of a CSL has been encoded, the remaining SSWs
may be omitted. This can be recognized by maintaining a running count of
processed "1"-bits within the CSL and comparing this to W. The format of an
Encoded Scan Line (ESL) is shown in Figure A-3.
| Scan   |        |        |     |        |     |        |
| Line   | SSW(1) | SSW(2) | ... | SSW(i) | ... | SSW(m) |
| Weight |        |        |     |        |     |        |
                                /          \
                       | WEIGHT(i) | RANK(i) |

Figure A-3. -- Encoded Scan Line Format
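The segmentation substeps, including the padding and the omission of trailing null segments, might be sketched as below. The function name `segment_csl` and the list-of-bits representation are assumptions made for illustration; the actual WEIGHT and RANK field encoding is deliberately left out.

```python
def segment_csl(csl, n):
    """Split a CSL (list of 0/1) into n-bit segments.

    Returns (W, segments), where W is the weight of the CSL and
    segments holds only the segments up to and including the last
    non-null one. The CSL is padded with "0"-bits to a multiple of n.
    """
    w = sum(csl)
    if w == 0:
        return 0, []                      # Substep 1: nothing more to send
    padded = csl + [0] * (-len(csl) % n)  # pad to a convenient length
    segments = []
    ones_seen = 0                         # running count of "1"-bits
    for start in range(0, len(padded), n):
        segment = padded[start:start + n]
        segments.append(segment)
        ones_seen += sum(segment)
        if ones_seen == w:                # all "1"-bits accounted for
            break
    return w, segments


if __name__ == "__main__":
    csl = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(segment_csl(csl, 4))
```

Because the decoder also knows W, it can stop reading SSWs at the same point and fill the rest of the scan line with "0"-bits.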

The key element of the SLDC encoding is the efficient manner in
which the WEIGHT and RANK fields are encoded. This is described in the
following two sections.

A-2.2.1  Segment  Weight  Encoding  Procedure. 

Let p be the average probability of a "1"-bit within a CSL. By
definition,

p = W/mn [EQ. A-2]

The segment weight shall be encoded using a variable length Huffman
code. This type of code is extensively described in most information theory
texts (e.g., Ref. 2). If P_k is defined to be the probability that the
weight of an n-bit segment is k, then Huffman coding shall minimize the
average length of the segment weight field. Thus,

sum_{k=0}^{n} P_k L_k = minimum [EQ. A-3]

where L_k = number of bits in the WEIGHT field specifying a segment of weight k.

For reasons to be explained in Section A-2.2.2, the maximum weight
segment that shall be handled by the normal SLDC process is k', where k' is
listed in Table A-1. Segments of weight greater than k' shall be handled by
special procedures to be described later.

It shall be assumed that the W "1"-bits are binomially distributed
within the CSL. Hence,

P_k = Prob[segment weight = k] = C(n,k) p^k (1-p)^{n-k} [EQ. A-4]

where C(n,k) = n! / (k! (n-k)!).

The binary Huffman code is computed for the set of events with respective
probabilities,

P_0, P_1, ..., P_{k'}, and S_{k'}

where S_{k'} = 1 - sum_{k=0}^{k'} P_k. [EQ. A-5]
After the codebook has been computed, the codewords shall be used to
define the segment lengths for the current CSL. Note that p shall typically
vary from one scan line to the next, and therefore either a set of codebooks
must be prestored corresponding to all possible values of p or the codebook
must be recomputed for each scan line.

For segments whose weight (k) is less than or equal to k', the segment
weight is identified by use of the codeword corresponding to probability
P_k. If the segment weight exceeds k', then the codeword for probability
S_{k'} shall be encoded, followed by the n-bit segment in uncompressed form.
In this case the rank encoding process, described in the following section,
is omitted.
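The codebook computation for one scan line can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the CODGEN routine of the simulation: the function names are hypothetical, the probabilities follow EQ. A-4 and EQ. A-5, and only the Huffman codeword lengths are produced (the actual bit patterns are not assigned).

```python
import heapq
from math import comb


def weight_probabilities(n, p, k_max):
    """Return [P_0, ..., P_k', S_k'] for n-bit segments (EQ. A-4, A-5)."""
    probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1)]
    escape = max(1.0 - sum(probs), 0.0)   # S_k': weight exceeds k'
    return probs + [escape]


def huffman_code_lengths(probs):
    """Binary Huffman codeword length for each event, in input order."""
    if len(probs) == 1:
        return [1]
    # Heap items: (probability, unique tie-breaker, event indices in subtree).
    heap = [(pr, i, [i]) for i, pr in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, events1 = heapq.heappop(heap)
        p2, _, events2 = heapq.heappop(heap)
        for event in events1 + events2:
            lengths[event] += 1           # one level deeper in the code tree
        heapq.heappush(heap, (p1 + p2, tie, events1 + events2))
        tie += 1
    return lengths


if __name__ == "__main__":
    # Example: 24-bit segments, p = 0.05, k' = 5 (16-bit word, Table A-1).
    probs = weight_probabilities(24, 0.05, 5)
    print(huffman_code_lengths(probs))
```

The last entry of the returned list corresponds to the escape event S_{k'}, whose codeword is followed by the uncompressed n-bit segment.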

A-2.2.2  Rank  Encoding  Procedure. 

The rank of an n-bit segment of weight k shall be defined to be the
number of possible n-bit sequences of weight k that are numerically less than
the segment being encoded. The rank encoding process can be implemented
recursively by successively scanning the bits of the segment from most
significant bit to least significant bit and, whenever a "1"-bit is
encountered, accumulating a count of the number of patterns which are already
known to be numerically inferior to the segment. The rank field is
omitted if k=0 and, if k is greater than 0, the rank is encoded into a field
of length [log2 C(n,k)] bits, where [X] represents the smallest integer
greater than or equal to X.

The following example may help explain the procedure. Let the segment
length and weight be n and k respectively (k greater than 0). The process is
initialized by clearing a counter, named RANK, to zero. Then, starting at the
most significant bit, the process recursively scans the segment from left to
right as shown in Figure A-4, and increments RANK by an appropriate constant
whenever a "1"-bit is encountered. For example, suppose the scanning process
has progressed to the point where Bit-I is about to be examined and that j
"1"-bits (0 ≤ j ≤ k) have previously been encountered in the scanning process
from Bit-(n-1) to Bit-(I+1) inclusive. There are therefore (k-j) "1"-bits
within the segment that have not yet been discovered. Bit-I is now examined
and, if Bit-I = 0, the scan process immediately moves to the next bit,
Bit-(I-1). If Bit-I = 1, then the current segment is known to be numerically
superior to all C(I,k-j) sequences which have their low order (k-j) "1"-bits
confined to the I low order bit positions (i.e., Bit-0 to Bit-(I-1) inclusive).
Therefore RANK shall be incremented by C(I,k-j) and j shall be incremented by
1 (in that order) before proceeding to the next scan position.

The rank encoding process terminates when either all k "1"-bits have
been found by the scanning procedure OR there are exactly j "1"-bits that have
not yet been found AND there are only j bit positions remaining to be
scanned. In the latter case, all unscanned bit positions must contain "1" and
therefore have the lowest possible rank. Hence, since the rank increment
would be zero, the remaining scanning steps can be bypassed. Annex-1 contains
a numerical example of the rank encoding process.
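The scanning procedure described above can be written out as a short routine. The sketch below is an illustration, not the FORTRAN module used in the simulation; the function names and the string representation of a segment (most significant bit first) are assumptions.

```python
from math import comb


def rank_of_segment(bits):
    """Rank of a segment given as a string of '0'/'1', MSB first.

    The rank is the number of patterns of the same length and weight
    that are numerically smaller than the segment.
    """
    n = len(bits)
    k = bits.count("1")
    rank = 0
    j = 0                                 # "1"-bits found so far
    for pos, bit in enumerate(bits):
        i = n - 1 - pos                   # bit number I, from n-1 down to 0
        if j == k or (k - j) == i + 1:
            break                         # all found, or the rest are all "1"
        if bit == "1":
            rank += comb(i, k - j)        # C(I, k-j), then advance j
            j += 1
    return rank


def rank_field_bits(n, k):
    """Length of the RANK field, [log2 C(n,k)] rounded up; 0 when k = 0."""
    if k == 0:
        return 0
    return (comb(n, k) - 1).bit_length()  # integer ceiling of log2


if __name__ == "__main__":
    print(rank_of_segment("010100100"), rank_field_bits(9, 3))
```

Using the integer `bit_length` avoids any floating point rounding when C(n,k) is an exact power of two.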
                   n-bit segment of weight k

Bit Number:   | n-1 | n-2 | ... |  I  | ... |  0  |
                 |                 |           |
               Most           Current bit    Least
            Significant     being examined   Significant
                Bit                          Bit

                   Direction of bit scan -->

Figure A-4. -- Rank Encoding Scanning Process


It shall be presumed that the encoding algorithm is capable of
performing arithmetic integer computation with a maximum word length of either
16 or 32 bits. If the segment length is greater than the arithmetic word
length, the RANK value might cause an arithmetic overflow. For this reason,
the value of k' shall be limited to the following maximum values to avoid the
possibility of an overflow condition.


                        Maximum value of k'
Segment length
in bits           16-bit word length    32-bit word length

18 or less        unconstrained         unconstrained
20                6                     unconstrained
24                5                     unconstrained
28                4                     unconstrained
32                4                     unconstrained
36                4                     14
40                3                     11
44                3                     10
48                3                     9
52                3                     8
56                3                     8
60                3                     8
64                3                     7

Table A-1. - Maximum Weight Segment for Normal SLDC Processing
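One plausible reading of the overflow constraint is that every binomial coefficient C(n,k) for k up to k' must fit in an unsigned word. The sketch below implements that reading only; the criterion itself and the function name `max_weight` are assumptions, not taken from the report. It agrees with the entries checked in the usage lines, but at least one entry (segment length 52 with a 32-bit word) appears to come out one higher than the table under this criterion, so the rule actually used may have been more conservative.

```python
from math import comb


def max_weight(n, word_bits):
    """Largest k' such that C(n, k) < 2**word_bits for every k <= k'.

    Returns None when even the largest coefficient C(n, n//2) fits,
    i.e. when the segment weight is unconstrained.
    """
    limit = 1 << word_bits
    if comb(n, n // 2) < limit:
        return None
    k = 0
    while comb(n, k + 1) < limit:         # coefficients grow up to n//2
        k += 1
    return k


if __name__ == "__main__":
    for n in (18, 20, 24, 36, 64):
        print(n, max_weight(n, 16), max_weight(n, 32))
```

The loop always terminates because the branch is only reached when the central coefficient does not fit in the word.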


A-3  RATIONALE  FOR  SLDC  ALGORITHM 


The effectiveness of a facsimile data compression system must
necessarily be based upon the performance/cost tradeoff with respect to
documents representing a realistic traffic load. It is the purpose of the
present investigation to subject the SLDC algorithm to such a test using
document images believed to be representative of commercial requirements.
Until the results of this investigation are completed, the effectiveness of
the SLDC algorithm remains in doubt.

Since the forthcoming investigation shall require money and other
resources, it is reasonable to insist upon some rationale for the potential
benefits that may be gained before embarking on this investigation. There are
an infinite number of possible compression algorithms. Why should the
Government expend resources to investigate the SLDC technique? While an
insistence on a "proof-of-effectiveness" as a prerequisite for the
investigation is impossible and would represent a Catch-22 dilemma, the
following is offered as a plausibility argument in defense of the SLDC
algorithm.

The SLDC represents a two-step process. In the first step the image is
reduced to a set of conditioned scan lines (CSLs), each of which is
constructed so that it contains only the "unexpected" changes in image
pattern relative to the previously scanned neighboring image elements. A
pattern extension that can be implied from the previously scanned image need
not be transmitted, since the decoder could reconstruct the image extension
without guidance from the encoder. It is only when the decoder would
incorrectly construct the image that guidance is needed from the encoder to
override the decoder's default algorithm. Unexpected changes are signalled by
a binary "1"; the absence of an unexpected change is encoded as a "0". It is
expected that the investigation performed under the Section 3.1 subtask shall
develop an efficient procedure that minimizes the frequency of unexpected
changes. It seems reasonable that a conditioning technique which predicts the
binary state of an image element based upon some weighted average of four of
the element's nearest neighbors should be correct most of the time.

The second step of the SLDC algorithm represents the actual compression
process. Each CSL to be compressed is fragmented into a series of n-bit
segments. For each segment a variable length Huffman code is used to encode
the weight (k) of the segment. Huffman codes are known to be optimal in the
sense that no other distortionless code is possible which can encode the data
into a lesser average number of bits. Assuming that each of the possible
C(n,k) weight-k patterns of length n bits is equally likely, the rank
encoding procedure is also theoretically optimal in encoding the position of
the k "1"-bits in the segment. Thus, each of the components of the encoding
process is optimal in some sense; however, this does not imply that the entire
algorithm is globally optimal. Indeed, it is intuitively obvious that some
non-optimality is introduced at each stage of modularization. The SLDC
algorithm has performed the following modularization steps:

1. the fragmentation of an image into CSLs,

2.  the  fragmentation  of  a  CSL  into  segments,  and, 

3.  the  separate  encoding  of  segment  weight  and  specific  segment  pattern 
given  a  known  weight. 
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While  each  of  these  partitioning  operations  shall  undoubtedly  reduce  the 
compression  ratio  performance  relative  to  a  globally  optimal  compressor,  the 
modularization  is  expected  to  reduce  the  computational  complexity  and  may 
therefore  allow  the  SLDC  algorithm  to  approach  the  globally  optimal 
compression scheme more closely than previously implemented compression
schemes. Hence, although the performance and feasibility of implementation
cannot be proven at this point, the SLDC procedure offers considerable promise
and is considered worthy of an empirical demonstration which shall evaluate its
effectiveness relative to other compression schemes.
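
The second and third modularization steps can be illustrated with a short
sketch. It is not the FORTRAN software described in this report; the function
name segment_fields and the representation of a CSL as a string of "0" and
"1" characters are assumptions made for illustration only. For each n-bit
segment it reports the weight k (the quantity that would be Huffman coded)
and the number of bits needed for the rank field, ceil(log2 C(n,k)).

```python
from math import comb


def segment_fields(csl_bits, n):
    """Split a CSL into n-bit segments; return (segment, weight, rank_bits).

    csl_bits is a string of '0'/'1' characters (an illustrative
    representation, not the file format used by the report's software).
    """
    fields = []
    for start in range(0, len(csl_bits), n):
        segment = csl_bits[start:start + n]
        k = segment.count("1")  # segment weight
        # Number of bits needed to distinguish the C(len, k) equally likely
        # patterns of this weight: ceil(log2(C)) == (C - 1).bit_length().
        rank_bits = (comb(len(segment), k) - 1).bit_length()
        fields.append((segment, k, rank_bits))
    return fields


if __name__ == "__main__":
    for segment, k, rank_bits in segment_fields("010100100000000000", 9):
        print(segment, "weight", k, "rank field bits", rank_bits)
```

An all-white segment (k = 0) needs no rank field at all, which is where most
of the saving on a well-conditioned scan line would be expected to come from.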


ANNEX-1  —  NUMERICAL  EXAMPLE  OF  RANK  ENCODING  PROCESS 


In  this  Annex  the  rank  encoding  of  a  9-bit  segment  of  weight  3  shall  be 
evaluated. In this case, n=9 and k=3. Suppose the segment pattern were
010100100  and  let  us  establish  the  bit  numbering  convention  as  shown  below. 

Bit Number:  8 7 6 5 4 3 2 1 0
Bit State:   0 1 0 1 0 0 1 0 0

Initialize by setting RANK = 0, j = 0 (number of previously encountered "1" bits),
and the bit scan position to Bit 8.


Scan       Bit     No. of Found
Position   State   "1" Bits       C(i,k-j)   Rank Encoding Computation
  (i)               (j)
--------   -----   ------------   --------   -------------------------------
   8         0          0            56      RANK = 0 (initialization)
   7         1          0            35      RANK = 0 + 35 = 35
   6         0          1            15      RANK = 35
   5         1          1            10      RANK = 35 + 10 = 45
   4         0          2             4      RANK = 45
   3         0          2             3      RANK = 45
   2         1          2             2      RANK = 45 + 2 = 47
   1         0          3                    j = k: TERMINATE with RANK = 47

The encoded length of the RANK field is ⌈log2 C(9,3)⌉ = ⌈log2 84⌉ = ⌈6.39⌉ = 7 bits.
Thus the RANK field is encoded as 0101111.
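
The scan described in this Annex can be sketched as follows. This is an
illustrative reconstruction and not the report's FORTRAN module; the names
rank_encode and rank_field are hypothetical, and the segment is assumed to be
given as a string with the highest-numbered bit first.

```python
from math import comb


def rank_encode(segment):
    """Return the rank of a '0'/'1' string among patterns of equal weight."""
    n = len(segment)
    k = segment.count("1")
    rank = 0
    j = 0  # number of previously encountered "1" bits
    for offset, bit in enumerate(segment):
        i = n - 1 - offset  # bit number; the leftmost bit is bit n-1
        if j == k:
            break  # all "1" bits found: terminate
        if bit == "1":
            rank += comb(i, k - j)
            j += 1
    return rank


def rank_field(segment):
    """Return the rank as a fixed-width binary string of ceil(log2 C(n,k)) bits."""
    n, k = len(segment), segment.count("1")
    width = (comb(n, k) - 1).bit_length()
    return format(rank_encode(segment), "b").zfill(width) if width else ""


if __name__ == "__main__":
    print(rank_encode("010100100"))  # 47
    print(rank_field("010100100"))   # 0101111
```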


To  validate  this  example,  the  table  below  lists  all  possible  n=9,  k=3  patterns 
in numerically increasing order up to and including the test case (010100100).


RANK  PATTERN        RANK  PATTERN        RANK  PATTERN
----  ---------      ----  ---------      ----  ---------
  0   000000111       16   000110001       32   001100100
  1   000001011       17   000110010       33   001101000
  2   000001101       18   000110100       34   001110000
  3   000001110       19   000111000       35   010000011
  4   000010011       20   001000011       36   010000101
  5   000010101       21   001000101       37   010000110
  6   000010110       22   001000110       38   010001001
  7   000011001       23   001001001       39   010001010
  8   000011010       24   001001010       40   010001100
  9   000011100       25   001001100       41   010010001
 10   000100011       26   001010001       42   010010010
 11   000100101       27   001010010       43   010010100
 12   000100110       28   001010100       44   010011000
 13   000101001       29   001011000       45   010100001
 14   000101010       30   001100001       46   010100010
 15   000101100       31   001100010       47   010100100  -- Test Case Segment
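
The validation table can also be regenerated mechanically. The sketch below
simply lists every n-bit value of weight k in numerically increasing order
and reads off the position of the test case; the function name
patterns_in_order is hypothetical and the brute-force enumeration is chosen
for clarity, not efficiency.

```python
def patterns_in_order(n, k):
    """List all n-bit patterns of weight k in numerically increasing order."""
    return [
        format(value, "b").zfill(n)
        for value in range(2 ** n)
        if bin(value).count("1") == k
    ]


if __name__ == "__main__":
    patterns = patterns_in_order(9, 3)
    print(len(patterns))                 # 84 patterns in total
    print(patterns.index("010100100"))   # 47, the rank of the test case
```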
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PROGRAM:  CHART 


DESCRIPTION: 

This program reads an image input file one line at a time
recording  occurrences  of  each  nearest  neighbor  state  vector  and  state 
of  each  bit  for  every  bit  in  the  file.  The  results  are  sent  to  the 
line  printer  in  the  form  of  Table  1  in  Appendix  A.  Also  an 
occurrence-black-white  table  of  the  same  nature  is  calculated. 

CALLING  SEQUENCE 

CHART, <INPUT NAME>, <PROBABILITY FILE NAME>

INPUT  NAME  -  Input  Image  File 

OUTPUT NAME - File consisting of a "1" or "0" for each neighbor
template  depending  on  probabilities  calculated.  This 
file  will  be  used  in  Conditioned  Image  File  creation 
and  restoration. 

ORDER  OF  PARAMETERS: 

1)  Number  of  records  in  input  file 

2)  CCITT  File  # 

3)  File  Resolution 
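
The kind of tally CHART performs can be sketched as below. This is not the
CHART source. The four-pixel neighbor template (left, upper-left, upper,
upper-right), the treatment of pixels outside the image as white, and the
rule of predicting white on a tie are all assumptions made for illustration;
the template actually used is the one defined in Appendix A.

```python
from collections import defaultdict


def neighbor_state(image, row, col):
    """Return the assumed 4-neighbor state vector for pixel (row, col)."""
    def pixel(r, c):
        # Pixels outside the image are treated as white (0): an assumption.
        if r < 0 or c < 0 or c >= len(image[r]):
            return 0
        return image[r][c]

    return (
        pixel(row, col - 1),      # left neighbor on the current line
        pixel(row - 1, col - 1),  # upper-left neighbor on the previous line
        pixel(row - 1, col),      # upper neighbor
        pixel(row - 1, col + 1),  # upper-right neighbor
    )


def chart(image):
    """Tally white/black occurrences per neighbor state and derive predictions."""
    counts = defaultdict(lambda: [0, 0])  # state -> [white count, black count]
    for row, line in enumerate(image):
        for col, bit in enumerate(line):
            counts[neighbor_state(image, row, col)][bit] += 1
    # Predict "1" only when black was strictly more frequent for that state.
    prediction = {
        state: int(black > white) for state, (white, black) in counts.items()
    }
    return counts, prediction


if __name__ == "__main__":
    image = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]
    counts, prediction = chart(image)
    for state in sorted(counts):
        print(state, counts[state], "predict", prediction[state])
```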
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PROGRAM:  CREATE 


DESCRIPTION: 

This  program  creates  a  CIF  along  with  statistics  requested  in 
subtask  1  (see  Section  2.0).  The  file  generated  in  CHART  is  read  in 
and  used  to  generate  the  CIF. 

CALLING  SEQUENCE 

CREATE, <INPUT NAME>, <OUTPUT NAME>, <PROBABILITY FILE>

INPUT  NAME  -  Input  Name  of  Image 
OUTPUT  NAME  -  Output  Name  of  CIF 
PROBABILITY  FILE  -  File  Generated  by  CHART 

ORDER  OF  PARAMETERS 

1)  # words per output record

2)  # records to be output

3)  CCITT  file  number 

4)  File  resolution 

5)  #  records  in  input  file 
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PROGRAM:  RESTORE 


DESCRIPTION: 

This  program  restores  the  CIF  back  to  the  original  image  file. 
The  probability  file  used  to  generate  the  CIF  is  read  in  and  used  to 
restore  the  file  to  its  original  state. 

CALLING  SEQUENCE: 

RESTORE - <INPUT NAME>, <OUTPUT NAME>, <PROBABILITY FILE>

INPUT  NAME  -  Input  CIF  Name 

OUTPUT  NAME  -  Restored  Output  Image  File 

PROBABILITY FILE - File used to create CIF; file is calculated
                   in program CHART.

ORDER  OF  PARAMETERS 

1)  # words to be output

2)  # records to be output

3)  #  records  in  input  file 
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PROGRAM:  ENCODE 


DESCRIPTION: 

This  program  encodes  an  input  CIF  into  an  output  encoded  file. 
Before  encoding  each  line,  a  codebook,  as  discussed  in  Section  3.0,  is 
generated  and  stored.  The  output  file  will  be  written  to  disk  if  an 
output  file  is  requested.  There  is  also  an  option  to  print  line  by 
line  compression  statistics. 

CALLING  SEQUENCE 

ENCODE - <INPUT NAME>, <OUTPUT NAME>, <CODE FILE>,
         <WEIGHT FILE>

INPUT NAME - Input CIF Name
OUTPUT NAME - Output Encoded File Name

ORDER OF PARAMETERS

1)  #  records  per  output  file 

2)  Decision  for  output  file 

3)  Max  weight  for  segment  (Table  3.1) 

4)  CCITT  file  number 

5)  File  resolution 

6)  Arithmetic word length

7)  # words per input record

8)  #  records  in  input  file 

9)  Segment  length  to  be  compressed 

10)  Decision  for  line  by  line  compression  stats. 


PROGRAM:  DECODE 


DESCRIPTION: 

Program DECODE decodes the encoded file back into a CIF. The
prestored codebook generated in program ENCODE is read in and used for
decoding the segment weights.

A linked Huffman decoding tree is read in and used to find the Huffman
code used for the segment weight coding. The Huffman code is then
looked up in the prestored codebook to find the segment weight. Once
the weight of the segment is found, the rank can be easily calculated.

CALLING  SEQUENCE: 

DECODE <INPUT NAME>, <OUTPUT NAME>, <WEIGHT FILES>, <TREE FILES>
INPUT  NAME  -  Encoded  Input  CIF  Name 
OUTPUT  NAME  -  Decoded  Output  CIF  name 

ORDER  OF  PARAMETERS 

1)  # words per output record

2)  #  records  to  be  output 

3)  # words per input record

4)  #  records  in  input  file 

5)  Segment  length  to  be  compressed 

6)  Max  weight  encodable  (Table  3.1) 
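
Once the segment weight k has been recovered from its Huffman code, the rank
field can be turned back into the segment pattern by running the rank
encoding scan of Annex-1 in reverse. The sketch below shows one way this
might be done; it is not the DECODE source, and the name rank_decode is
hypothetical.

```python
from math import comb


def rank_decode(rank, n, k):
    """Rebuild the n-bit pattern of weight k that has the given rank."""
    bits = []
    j = 0  # number of "1" bits already placed
    for i in range(n - 1, -1, -1):  # scan from bit n-1 down to bit 0
        remaining = k - j
        # A "1" at bit i contributes C(i, remaining) to the rank, so the bit
        # is set exactly when the residual rank is at least that large.
        if remaining > 0 and rank >= comb(i, remaining):
            bits.append("1")
            rank -= comb(i, remaining)
            j += 1
        else:
            bits.append("0")
    return "".join(bits)


if __name__ == "__main__":
    print(rank_decode(47, 9, 3))  # 010100100, the Annex-1 test case
```

Because C(i, r) is zero whenever i is less than r, the same comparison also
forces the remaining low-order bits to "1" when no other placement is
possible, so no special case is needed at the end of the segment.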
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